
Computer based system for digital media preparation. 
Technical Field 

This invention relates to a computer software system for the preparation of digital media.
Background Art 

Application software for editing digital video is an extremely sophisticated and powerful tool
because it is primarily designed for, and sold to, the video professional. Such an individual
requires access to many complex functions and is prepared to invest time and effort in
learning to become skilled in their use. Historically, the terminology and conventions of
digital editing have evolved from a traditional film editing environment where rushes are cut
and spliced together to tell a story or follow a script. As digital mixer technology advanced,
new techniques were combined with these conventional methods to form the early
pioneering software-based digital editors.

To the video or film professional, editing is second nature and the complexities of a
time-based media go unnoticed since, having already grasped concepts and learned processes,
they are able to concentrate on the nuances of different editing packages, of which there are
many.

Conventionally these packages, through the use of a Graphical User Interface (GUI), attempt
to provide an abstraction of the media in terms of many separate tracks of video and audio.
These are represented on the output device in symbolic fashion and provision is made for
interacting with these representations using an input device such as a mouse. Typically the
purpose is to create a new piece of media as an output file, composed by assembling clips or
segments of video and audio along a timeline that represents the temporal ordering of
frames. Special effects such as wipes and fades can be incorporated, transparent overlays can
be added, and colour and contrast can be adjusted. The list of manipulations made possible by
such tools is very long indeed. A typical system is described in, for example, Foreman, Kevin
J., et al., "Graphical user interface for a video editing system", US Patent 6,469,711.

It is possible, however, that an individual who is a consumer of media, rather than a
producer, may need to perform a simple editing operation on a media file in order to
accomplish their primary task; for example, to give a multi-media presentation. In this case
such tools have their drawbacks. They may be too expensive to justify individually, or to
have enough of in order to be available when or where needed. The limited amount of use
and the small fraction of the capabilities used in such situations may make them uneconomic.
The steep learning curve associated with such tools may mean that an inappropriate amount
of effort is expended on something that is not the primary occupation or concern of the tool
user. For occasional or infrequent use there will be reluctance on the part of any user
repeatedly to switch environments or learn and relearn new tools to perform simple
last-minute tasks.

Work has been carried out with the view of improving the interaction between a user and a
video editor by providing 'intelligent' operations. The 'Silver' project (Juan P. Casares,
"SILVER: An Intelligent Video Editor," ACM CHI 2001 Student Posters, Seattle, WA,
March 31-April 5, 2001, pp. 425-426) uses 'smart selection' to assist the user to find 'in' and
'out' points. The 'in' and 'out' points are roughly set by the user and then 'snap' to a
boundary, which could be a shot change or the silence between spoken words, or other
similar features. Video and audio boundaries typically will not line up, so the system provides
some 'fixing-up' functions to smooth the edit boundary.

Conventionally, video editors are application programs that run on high-end PCs and
workstations under desktop-oriented operating systems such as Microsoft Windows or
Apple's Mac OS X, often with high-resolution screens and high-bandwidth network
connectivity. The viewing of media files, however, can take place on an ever-expanding list
of devices with many different capabilities, such as laptops, mobile PDAs with wireless
connectivity, mobile phones, set-top boxes and hard-disc based personal video recorders
(PVRs). The concept of a simple media manipulation tool integrated into the media player
component is as relevant in these cases as it is in that of the standard PC, possibly more so
since, for example, a PVR may not have a run-time environment capable of running external
applications such as video editors.

Another class of device that is becoming ever more capable of media manipulation is the
mobile phone. Such devices now have the ability to capture, display and transmit moving
images, but, conventionally, are not thought of as a platform for editing video. There is no
reason, however, why simple editing operations should not be applied here in order to
enhance even the simplest and shortest of video presentations. Mobile phones present a
unique set of challenges to the user interface component of any application. First and
foremost, the display area is extremely limited and so immediately rules out multi-level
menus, timelines and story-boards. Secondly, the user interface is extremely constrained:
there is no mouse input, only a few options can be displayed at a time, and all interaction
must be performed using a set of navigation buttons (which may vary in position and size
according to the hardware manufacturer). Thirdly, the user expects to be able to perform
any action one-handed.

Accordingly, these are the attributes of a media preparation tool that is appropriate to the
needs of such a device:

* Simple and intuitive to use; in particular, little time and effort is required to learn enough
to accomplish the task in hand.

* Efficient use of screen area; no menus, timelines or story-boards.

* Efficient use of user input interface.

* Efficient editing model that allows simple trimming operations to be performed simply,
whilst permitting more complex tasks to be carried out.

* Predicts, where possible, the preferences of the user as regards editing limits.

Disclosure of Invention 

The invention relates to a method called VKV for simple video message preparation,
analogous to predictive text editing for mobile messaging. The method does not use the
conventional editing semantics of 'in' and 'out' points; it determines edit limits using rules
that are updated through user feedback and minimises the typical number of user
interactions required to perform a simple video 'trimming' task.

Briefly, the invention works as follows.

According to one aspect of the invention a Graphical User Interface (GUI) input interface
for editing is defined.

In the preferred embodiment of this aspect of the invention the controls consist of five
buttons:

* one for video 'forward' shuttle;
* one for video 'backward' shuttle;
* one button meaning 'include';
* one button meaning 'exclude';
* one button meaning 'apply'.
According to another aspect of the invention a Graphical User Interface (GUI) output
interface for editing is defined for feedback to the user.

In the preferred embodiment of this aspect of the invention the graphical elements consist
of:

* an 'edit bar' graphic on the display, consisting of a coloured rectangular area;
* a 'frame pointer' that marks the current frame on the edit bar;
* an 'include' graphic which overlays the corresponding frame and consists of a green 'tick';
* an 'exclude' graphic which overlays the corresponding frame and consists of a red 'cross'.



According to another aspect of the invention the sequence of actions from the user loading a
piece of digital media to the user applying the edits is called a 'session'; the first operation the
user performs during a session is called the 'initial selection'; subsequent operations that the
user performs are called the 'refinement phase'; a frame or frames that are in the final edit are
'included'; those that are not are 'excluded'; an operation that causes a number of frames to
change state from 'excluded' to 'included' or vice-versa is called a 'grow' operation; the actual
number of frames that change state from 'excluded' to 'included', or vice-versa, during a grow
operation is called the 'support'.

According to another aspect of the invention means are provided for storing as variables in a
computer memory information about the history of interactions between the user and the
video preparation tool; these are called 'session variables' and assist the user to determine the
limits of initial selection.

In the preferred embodiment of this aspect of the invention an integer variable used for
prediction called p is used automatically to determine the number of frames labelled as
'included', as a proportion of the initial length of the clip, when the user makes the initial
selection. When the program is used for the first time ever this variable is set to an arbitrary
initial value, for example, 4. If the length of the clip in frames is L then the support is given
by s = L/p. For example, if p equals 4 and L equals 100 then the support s equals 25 frames.
Therefore, if the user nominates a particular frame as being 'included' then the system
determines that 25 frames previous, and 25 frames subsequent, to this frame may also be
included.
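
The support calculation can be illustrated with a minimal Python sketch. The function names, the use of integer division and the clamping of the selection to the clip boundaries are assumptions made for illustration; the text specifies only s = L/p.

```python
def initial_support(clip_length: int, p: int) -> int:
    """Return the support s = L / p in whole frames.

    Integer division is an assumption; the text does not say how
    fractional results are rounded.
    """
    if clip_length < 0 or p <= 0:
        raise ValueError("clip_length must be non-negative and p positive")
    return clip_length // p


def initial_selection(clip_length: int, p: int, frame: int) -> tuple[int, int]:
    """Return the inclusive (first, last) frame range marked 'included'
    when the user nominates `frame` (0-based) as the initial selection.

    Clamping the range to the clip boundaries is an assumption.
    """
    s = initial_support(clip_length, p)
    first = max(0, frame - s)
    last = min(clip_length - 1, frame + s)
    return first, last


if __name__ == "__main__":
    print(initial_support(100, 4))        # 25, matching the worked example
    print(initial_selection(100, 4, 50))  # (25, 75)
```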
According to another aspect of the invention, means are provided for using and updating the
'session variables' to assist the user to determine the limits of editing operations that occur
during the refinement phase.

In the preferred embodiment of this aspect of the invention, after an editing session is
complete the actual number of frames (f) included in the final video message is read and is
used to derive a new value of p as follows: p(new) = 2L/f. So, for example, if L equals 100
and the length of the final message is 40 frames then the new value of p reflects the fact that
fewer frames were actually required than were predicted, and the predicted p for the next edit
session becomes 200/40 = 5. Assuming an initial length of 100 frames in the next editing
session, a support value s equal to 20 frames is used.
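
The update rule can be sketched in Python as follows. One reading of the formula, offered here as an interpretation rather than something the text states, is that an initial selection spans roughly 2s = 2L/p frames, so choosing p = 2L/f makes the next prediction match the length f the user actually kept. The function name, the rounding to the nearest integer, the lower bound of 1 and the handling of an empty edit are all assumptions.

```python
def updated_p(clip_length: int, final_frames: int, current_p: int) -> int:
    """Return p(new) = 2L / f after a completed session.

    Assumptions (not in the text): the result is rounded to the nearest
    integer, is never allowed below 1, and p is left unchanged when the
    final edit contains no frames.
    """
    if final_frames <= 0:
        return current_p
    # Integer arithmetic equivalent to rounding 2L / f to the nearest integer.
    new_p = (2 * clip_length + final_frames // 2) // final_frames
    return max(1, new_p)


if __name__ == "__main__":
    p = updated_p(clip_length=100, final_frames=40, current_p=4)
    print(p)         # 5, as in the worked example
    print(100 // p)  # 20 frames of support for the next 100-frame clip
```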

According to another aspect of the invention means are provided for storing as variables in a
computer memory information about the history of interactions between the user and the
video preparation tool; these are called 'session variables' and assist the user to determine the
limits of edit operations during the refinement phase.

In the preferred embodiment of this aspect of the invention a vector of integer variables r(i)
is used to model how the user refines the initial edit; the value of r(i) is equal to the support
in frames for the i-th refinement edit and is used to determine the number of frames to be
labelled as 'included' during refinement phases.

According to another aspect of the invention means are provided for using and updating the
'session variables' to assist the user to determine the limits of editing operations that occur
during the refinement phase.

In the preferred embodiment of this aspect of the invention, any operation that results in a
change of state of a frame from 'excluded' to 'included' is treated as a new edit and causes the
index i in r(i) to increment. The value of r(i) is equal to the support obtained during a 'grow'
operation and is preset to some arbitrary value the very first time the program is used. For
operation i, such a 'grow' operation changes the state of r(i) frames to 'included' and any
further editing within the limits of these frames causes the value of r(i) to be updated
accordingly.
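
The bookkeeping for r(i) might look like the following Python sketch. The class and method names, the default preset of 10 frames and the rule that an adjustment simply overwrites r(i) are all hypothetical; the text says only that r(i) is preset to an arbitrary value and updated "accordingly".

```python
class RefinementModel:
    """Per-refinement support values r(i), retained between sessions."""

    def __init__(self, default_support: int = 10) -> None:
        # Arbitrary preset used the first time each r(i) is needed (assumed value).
        self.default_support = default_support
        self.r: list[int] = []
        self.i = -1  # index of the current refinement edit; -1 means none yet

    def begin_grow(self) -> int:
        """Start a new refinement edit and return its predicted support r(i)."""
        self.i += 1
        if self.i >= len(self.r):
            self.r.append(self.default_support)
        return self.r[self.i]

    def record_adjustment(self, actual_support: int) -> None:
        """Store the support the user actually settled on for edit i."""
        if self.i < 0:
            raise RuntimeError("no refinement edit in progress")
        self.r[self.i] = max(0, actual_support)

    def new_session(self) -> None:
        """Reset the edit index; learned r(i) values are kept."""
        self.i = -1


if __name__ == "__main__":
    model = RefinementModel(default_support=10)
    print(model.begin_grow())   # preset value for the first refinement
    model.record_adjustment(6)  # user trimmed the grown region to 6 frames
    model.new_session()
    print(model.begin_grow())   # the learned value is reused next session
```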

According to another aspect of the invention means are provided for the user to select the
region of the video message that is of interest.
In the preferred embodiment of this aspect of the invention, the user operates the 'forward'
and 'backward' shuttle buttons to find a representative frame in the part
of the clip that is 'of most interest'. The desired frame is displayed along with smaller,
under-sampled versions of the previous and following frames.

According to another aspect of the invention means are provided to feedback to the user,
without the user having to preview the edit, frames that are 'included' and 'excluded'.

In the preferred embodiment of this aspect of the invention the 'edit bar' represents the
video clip being edited and a pointer in the 'edit bar' indicates the frame currently being
viewed. Regions of the bar that are green represent 'included' sections; regions that are red
represent 'excluded' sections. Prior to any editing taking place the bar is completely red,
meaning that all the frames are 'excluded'.
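
As a rough illustration of how the edit bar could be derived from the frame states, the Python sketch below collapses a per-frame list of included flags into coloured segments. The function name and the list-of-booleans representation are assumptions; the text does not describe how the bar is computed.

```python
from itertools import groupby


def edit_bar_segments(included: list[bool]) -> list[tuple[str, int, int]]:
    """Collapse per-frame states into (colour, first_frame, last_frame) runs.

    True means 'included' (green); False means 'excluded' (red).
    Frame indices are 0-based and inclusive.
    """
    segments = []
    position = 0
    for state, run in groupby(included):
        length = sum(1 for _ in run)
        colour = "green" if state else "red"
        segments.append((colour, position, position + length - 1))
        position += length
    return segments


if __name__ == "__main__":
    # Before any editing the whole bar is red.
    print(edit_bar_segments([False] * 5))
    # After including frames 1 and 2 of a four-frame clip.
    print(edit_bar_segments([False, True, True, False]))
```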

According to another aspect of the invention, means are provided to feedback to the user,
involving the user previewing the edit, frames that are 'included' and 'excluded'.

In the preferred embodiment of this aspect of the invention each frame that is 'included' is
overlaid with a green 'tick' and each frame that is 'excluded' is overlaid with a red 'cross'. The
user can review these frames using the forward and backward shuttle controls.

According to another aspect of the invention means are provided for the user to manipulate
the region of the video message that is included.

In the preferred embodiment of this aspect of the invention the user operates the 'include'
button to grow regions of the video clip for inclusion in the final edit. Assuming that the
user has stopped at a frame in a region of interest the interaction is as follows:

* If the 'include' button is pressed once, the part of the edit bar under the frame pointer
goes green to indicate that only the current frame is included; the rest of the bar remains
unchanged.

* If the 'include' button is pressed once more, a region corresponding to the support
before and after the frame pointer position goes green to indicate that this region is
included; the rest of the bar remains unchanged.

* If the 'include' button is pressed once more, a region from the start of the bar up to the
pointer and a region corresponding to the support after the frame pointer position go
green to indicate that all the frames from the beginning of the video to the current
position are included, and a number of frames after the current position corresponding
to the support are also included.

* If the 'include' button is pressed once more, a region from the end of the bar back to the
pointer and a region corresponding to the support before the frame pointer position
go green to indicate that all the frames from the current position to the end of the
video are included, and a number of frames before the current position corresponding to
the support are also included.

* Further presses repeatedly cycle round the four above cases.
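
The four-case cycle for the 'include' button can be sketched in Python as a function from press count to frame range. The function name, 0-based inclusive indices and clamping at the clip boundaries are assumptions for illustration.

```python
def include_region(press: int, pointer: int, support: int, length: int) -> tuple[int, int]:
    """Return the inclusive (first, last) frame range turned green by the
    given consecutive press of 'include' (1 = first press).

    Ranges are clamped to the clip; that clamping is an assumption.
    """
    if press < 1:
        raise ValueError("press counts start at 1")
    if not 0 <= pointer < length:
        raise ValueError("pointer must lie inside the clip")
    before = max(0, pointer - support)
    after = min(length - 1, pointer + support)
    case = (press - 1) % 4        # presses beyond the fourth cycle round
    if case == 0:
        return pointer, pointer   # current frame only
    if case == 1:
        return before, after      # support either side of the pointer
    if case == 2:
        return 0, after           # start of clip to pointer, plus support after
    return before, length - 1     # support before, pointer to end of clip


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, include_region(n, pointer=50, support=25, length=100))
```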

In another embodiment of this aspect of the invention the user operates two 'handles' on the
edit bar that define the start and end of the included region, respectively.

The user also operates the 'exclude' button to grow regions of the video clip for exclusion
from the final edit. Assuming that the user has stopped at a frame in a region of interest the
interaction is as follows:

* If the 'exclude' button is pressed once, the part of the edit bar under the frame pointer
goes red to indicate that only the current frame is 'excluded'; the rest of the bar remains
unchanged.

* If the 'exclude' button is pressed once more, a region corresponding to the support
before and after the frame pointer position goes red to indicate that this region is
'excluded'; the rest of the bar remains unchanged.

* If the 'exclude' button is pressed once more, a region from the start of the bar up to the
pointer goes red to indicate that all the frames from the start of the video to the current
position are 'excluded'.

* If the 'exclude' button is pressed once more, a region from the pointer to the end of the
bar goes red to indicate that all the frames from the current position to the end of the
video are 'excluded'.

* Further presses repeatedly cycle round the four above cases.
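
Unlike 'include', the third and fourth 'exclude' cases do not add the support beyond the pointer. The Python sketch below applies one press to a per-frame state list. The names and the list-of-booleans model are hypothetical, and the text does not say whether successive presses accumulate or replace the previous press, so the sketch simply applies a single press to whatever state it is given.

```python
def exclude_region(press: int, pointer: int, support: int, length: int) -> tuple[int, int]:
    """Return the inclusive (first, last) frame range turned red by the
    given consecutive press of 'exclude' (1 = first press)."""
    if press < 1:
        raise ValueError("press counts start at 1")
    if not 0 <= pointer < length:
        raise ValueError("pointer must lie inside the clip")
    case = (press - 1) % 4
    if case == 0:
        return pointer, pointer  # current frame only
    if case == 1:                # support either side, clamped (assumption)
        return max(0, pointer - support), min(length - 1, pointer + support)
    if case == 2:
        return 0, pointer        # start of clip up to the pointer
    return pointer, length - 1   # pointer to the end of the clip


def apply_exclude(included: list[bool], press: int, pointer: int, support: int) -> list[bool]:
    """Return a copy of the frame states with the selected range excluded."""
    first, last = exclude_region(press, pointer, support, len(included))
    result = list(included)
    for frame in range(first, last + 1):
        result[frame] = False
    return result


if __name__ == "__main__":
    bar = [True] * 10
    for n in range(1, 5):
        print(n, apply_exclude(bar, n, pointer=4, support=2))
```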

According to another aspect of the invention means are provided for the user to export the
edited video message.



In the preferred embodiment of this aspect of the invention the user operates the 'apply'
button to export the edited video message.

According to another aspect of the invention means are provided for the user to select further
options prior to completion.

In the preferred embodiment of this aspect of the invention the user selects, through
interaction with a menu, the following:

* add 'fades' where frames have been deleted;

* add 'transitions' where frames have been deleted;

* add a background music track;

* add text annotation.

According to another aspect of the invention if any editing operation results in a single
stationary frame being displayed to the user then this frame can be treated as a still image and
processed separately.

In the preferred embodiment of this aspect of the invention the system monitors the support
for the currently displayed frame and, if this is equal to one, asks the user in a message box
whether this frame is required as a still; if the user replies 'yes' then the still is captured and
stored, and the editing session can then proceed.

Industrial Applicability 

As a simple example of the use of the invention consider this scenario. Using a built-in
camera a user of a mobile phone captures a short segment of video from a birthday party
and wishes to trim the segment. This trimming operation is wanted in order both to focus
in on the moment when the children blow out the candles on the birthday cake, and to
minimise the cost of mailing the video segment to friends and family. The video segment is
shuttled until the actual frame when the candles go out is displayed. The 'include' button is
pressed twice and the preparation tool, based on the past history of user interaction,
determines that three seconds of video before and after the chosen frame should be included
in the edit. The user runs to the start of the 'included' region and, using the 'include' button,
adds more frames to the final edit. The user then quickly runs forward and backward
checking that green 'tick' markers appear in the part of the clip of interest; then the 'apply'
button is pressed and the editing process is completed. The system measures the actual
number of frames set as 'included' and updates the memory variables used for prediction.
Claims

What is claimed is:

1. A computer based system for preparing digital media that adds the capability for media
preparation to a media player, wherein information held in computer memory describes
the interaction between the system and the user of the system and is retained between
uses of the preparation system, which information is used to predict the preferences of
the user during subsequent uses of the preparation system, and which information
additionally is used to assist the user in performing a media preparation task, and which
information additionally is updated as a result of the history of uses of the preparation
system.

2. The digital media preparation system of claim 1 comprising:

Input interfaces for forward and backward video transport;
Input interfaces for including and excluding video frames;
Input interfaces for applying the edit operation;

A graphical output interface that schematically represents the digital media in terms
of regions representing frames to be included in, and regions representing frames
that are to be excluded from, the final edit.

A 'frame pointer' that marks the current frame on the schematic representation.

An 'include' graphic on the display that overlays a frame and which confirms that that
frame is to be retained.

An 'exclude' graphic on the display that overlays a frame and which confirms that that
frame is to be discarded.

One or more variables held in computer memory that hold information describing
the interaction between the system and the user of the system.

3. The digital media preparation system of claims 1 & 2 wherein the sizes and positions of
the regions representing included frames are modified using the input interface for
including frames.

4. The digital media preparation system of claims 1 & 2 wherein the sizes and positions of
the regions representing excluded frames are modified using the input interface for
excluding frames.
5. The digital media preparation system of claims 1 - 4 wherein the input interfaces for
including and excluding frames, forward and backward transport, and applying the edit,
are implemented as push buttons.

6. The digital media preparation system of claims 1 - 4 wherein the input interfaces for
including and excluding frames, forward and backward transport, and applying the edit,
are implemented as graphics on a display.

7. The digital media preparation system of claims 1 - 6 wherein the sizes and positions of
the regions to be retained and discarded are controlled using multiple button presses.

8. The digital media preparation system of claims 1 - 7 wherein one button press selects one
frame only, that being the currently viewed frame, a second press selects a region centred
on the current frame, a third press selects the region from the beginning of the digital
media up to the current frame and a fourth press selects the region from the current frame to
the end of the digital media, the sequence then continuing to cycle as further button
presses are made.

9. The digital media preparation system of claims 1 - 8 wherein the action of selecting a
single frame nominates the selected frame as a still image which may then be subjected to
further processing appropriate to the properties of still images.

10. The digital media preparation system of claims 1 - 8 where media includes, but is not
confined to, video and audio.

11. The digital media preparation system of claims 1 - 8 where preparation includes, but is
not confined to, the operations of one or more of:
editing; trimming; annotating; effects; transitions; appearance; presentation.

12. The digital media preparation system of claim 1 wherein the information held in
computer memory determines the number of frames to set, in response to user
interaction, as being included or excluded.

13. The digital media preparation system of claim 12 wherein the information held in
computer memory determines the number of frames to set, in response to the first user
interaction of a session, as being included.




14. The digital media preparation system of claims 12 & 13 wherein the information held in
computer memory determines the number of frames to set, in response to user
interactions other than the first of a session, as being included or excluded.

15. The digital media preparation system of claims 12 - 14 wherein the information held in
computer memory is updated as a result of user interaction.